Abstract: One of the difficulties in calculating the capacity of
certain Poisson channels is that $H(\lambda)$, the entropy of the Poisson
distribution with mean $\lambda$, is not available in a simple form. In
this work we derive upper and lower bounds for $H(\lambda)$ that are
asymptotically tight and easy to compute. The derivation of such
bounds involves only simple probabilistic and analytic tools. This
complements the asymptotic expansions of Knessl (1998), Jacquet
and Szpankowski (1999), and Flajolet (1999). The same method
yields tight bounds on the relative entropy $D(n,p)$ between a
binomial and a Poisson, thus refining the work of Harremoës
and Ruzankin (2004). Bounds on the entropy of the binomial
also follow easily.

Index Terms: asymptotic expansion, binomial distribution,
central moments, complete monotonicity, entropy bounds, integral
representation, Poisson channel, Poisson distribution



I. Introduction 

Unlike the differential entropy for the Gaussian distribution,
the Shannon entropies for many basic discrete distributions,
such as the Poisson, the binomial, or the negative binomial, are
"not in closed form." In the Poisson case, the lack of a simple
analytic expression is seen ([24], [22]) as one of the obstacles
to obtaining the capacity of certain Poisson channels ([?], [12],
[26], [?]). Computation of the entropy is also a basic problem
partly motivated by the maximum entropy characterizations
([?], [?], [?], [29], [18], [30]) of these distributions.

One strategy to make the entropy functions more tractable
is to express them as integrals ([10], [20], [?]). See [13],
[21], [23] for related integral representations for entropy-like
quantities in the context of Poisson channels. For the entropy
functions themselves, integral representations have been used
to derive asymptotic expansions ([10], [20]). Alternatively,
asymptotic expansions can be obtained using analytic
depoissonisation ([16], [17]), singularity analysis ([11]), or local
limit theorems ([9]).

It is obviously desirable to have bounds that accompany
asymptotic expansions, for both theoretical analysis and
numerical computation. This is especially true for quantities such
as the entropy of the Poisson law, which may be used, e.g.,
in capacity calculations for discrete-time Poisson channels
([?], [?]). Part of this work aims to derive tight bounds on
the entropy for fundamental distributions such as the Poisson
and the binomial. Our results are expressed in terms of two
sequences of lower and upper bounds, and have the following
features.

• The sequence of upper bounds and the sequence of lower 
bounds each gives a full asymptotic expansion for the 
entropy. In other words the bounds are asymptotically 
tight. 

• The bounds are derived from familiar quantities such as 
the central moments of the Poisson, and are in a form 
simple enough for both theoretical analysis and numerical 
computation. 

• The derivation, which involves only real analysis, is
elementary. (Note that the asymptotic expansions of Knessl
[20] are also real-analytic.)

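Denote by $\mathbb{Z}_+ = \{0, 1, 2, \ldots\}$ and by $\mathbb{N} = \mathbb{Z}_+ \setminus \{0\}$. As
usual, the Shannon entropy for a discrete random variable $X$
on $\mathbb{Z}_+$ with mass function $f_i = \Pr(X = i)$ is defined as

$$H(X) = H(f) = -\sum_{i=0}^{\infty} f_i \log f_i,$$

where we use the natural logarithm and obey the convention
$0 \log 0 = 0$. Throughout we use $N_\lambda$ to denote a Poisson
random variable with mean $\lambda$, i.e., the mass function is

$$\Pr(N_\lambda = j) = \frac{e^{-\lambda} \lambda^j}{j!}, \quad j \in \mathbb{Z}_+.$$
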
We write $H(\lambda) = H(N_\lambda)$ for simplicity. The best known
bound on $H(\lambda)$ is perhaps

$$H(\lambda) \le \frac{1}{2} \log\left(2\pi e \left(\lambda + \frac{1}{12}\right)\right), \tag{1}$$

which is obtained by bounding the differential entropy of
$N_\lambda + U$, where $U$ is an independent random variable uniformly
distributed on $(0, 1)$ ([8], Theorem 8.6.5). While (1) is simple
in form and reasonably accurate, it lacks a corresponding lower
bound, and does not extend easily to capture higher order terms
in the expansion of $H(\lambda)$. As a remedy we shall derive, for
each $m \ge 1$, a double inequality of the form

$$\sum_{k} \frac{A(m,k)}{\lambda^k} \le H(\lambda) - \frac{1}{2}\log(2\pi\lambda) - \frac{1}{2} - \sum_{k=1}^{m-1} \frac{b(m,k)}{\lambda^k} \le \sum_{k} \frac{B(m,k)}{\lambda^k}, \tag{2}$$

where $A(m,k)$, $b(m,k)$, and $B(m,k)$ are explicit constants
($\sum_{k=1}^{0} = 0$). In other words, for each $m \ge 1$, we give a finite
asymptotic expansion in powers of $\lambda^{-1}$ with $m$ exact terms
and explicit lower and upper bounds of the order of $\lambda^{-m}$.
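
As a quick numerical illustration of the classical bound (1), the following sketch computes $H(\lambda)$ by direct summation of the Poisson mass function and compares it with $\frac{1}{2}\log(2\pi e(\lambda + 1/12))$. The function names and the truncation rule for the infinite sum are assumptions made here for illustration only; they are not part of the derivation.

```python
import math

def poisson_entropy(lam):
    """Entropy (in nats) of Poisson(lam) by truncated direct summation."""
    # Assumed truncation point: far enough into the tail for double precision.
    kmax = int(lam + 40 * math.sqrt(lam) + 60)
    h = 0.0
    for k in range(kmax + 1):
        log_p = -lam + k * math.log(lam) - math.lgamma(k + 1)
        h -= math.exp(log_p) * log_p
    return h

def bound_1(lam):
    """Right-hand side of (1): 0.5 * log(2*pi*e*(lam + 1/12))."""
    return 0.5 * math.log(2 * math.pi * math.e * (lam + 1.0 / 12.0))

for lam in (0.5, 1.0, 5.0, 20.0):
    print(lam, poisson_entropy(lam), bound_1(lam))
```

The direct sum serves only as a reference value; the point of the bounds in this work is to avoid it.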

Below we derive (in an equivalent form) the double
inequality (2). The key steps are

• an integral representation that relates $H(\lambda)$ to the simpler
quantity $E[\log(N_\lambda + 1)]$;

• bounds on $E[\log(N_\lambda + 1)]$ in terms of polynomials in
$\lambda^{-1}$, which translate easily to bounds on $H(\lambda)$.

Note that (2) is only effective for large $\lambda$. We also obtain
bounds on $H(\lambda)$ in terms of polynomials in $\lambda$, which work
for small $\lambda$.

Besides $H(\lambda)$, we also consider bounds on the relative
entropy between a binomial and a Poisson, thus obtaining a
version of "the law of small numbers" that refines the results of
Harremoës and Ruzankin [15]. While these are of theoretical
interest, they also lead to new bounds for the entropy of the
binomial. As usual, for two random variables $X$ and $Y$ on $\mathbb{Z}_+$
with mass functions $f$ and $g$ respectively, the relative entropy
is defined as

$$D(X\|Y) = D(f\|g) = \sum_{j=0}^{\infty} f_j \log\frac{f_j}{g_j}.$$


By convention $0\log(0/0) = 0$, and $D(f\|g) = \infty$ if $f$ assigns
mass outside of the support of $g$. Throughout we let $B_{n,p}$ be
a binomial random variable with mass function

$$\Pr(B_{n,p} = k) = \binom{n}{k} p^k q^{n-k}, \quad k = 0, 1, \ldots, n,$$

where $q = 1 - p$, $p \in (0, 1)$ and $n \in \mathbb{N}$. We consider

$$D(n,p) = D(B_{n,p}\|N_{np}),$$

i.e., the relative entropy between $B_{n,p}$ and $N_{np}$, and derive
bounds on $D(n,p)$ using similar techniques. Bounds on the
entropy of the binomial($n,p$) are obtained as a corollary.

Sections III and IV contain proofs of the main results. We
conclude with a short discussion on possible extensions in
Section V.

II. Main Results 

A. Sharp Bounds on $H(\lambda)$

We first present a class of double inequalities for $H(\lambda)$ that
is effective for small $\lambda$ (say $\lambda < 1$), but valid for all $\lambda > 0$.
Theorem 1: For any $\lambda > 0$ and $m = 1, 2, \ldots$, we have

$$\sum_{k=2}^{2m+1} \frac{c(k)}{k!}\lambda^k \le H(\lambda) + \lambda\log\lambda - \lambda \le \sum_{k=2}^{2m} \frac{c(k)}{k!}\lambda^k,$$

where

$$c(k) = \sum_{j=0}^{k-1} (-1)^{k-1-j}\binom{k-1}{j}\log(j+1), \quad k = 2, 3, \ldots$$


For fixed $m$, the two bounds given by Theorem 1 differ by
$O(\lambda^{2m+1})$. Hence they are most effective when $\lambda$ is small.
Moreover, inspection of the proof (Section III, Part B) shows
that, for $0 < \lambda < 1$, both the upper and lower bounds in
Theorem 1 converge to $H(\lambda) + \lambda\log\lambda - \lambda$ as $m \to \infty$.
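
The double inequality of Theorem 1 is straightforward to evaluate. The sketch below computes $c(k)$ as the alternating binomial sum of $\log(j+1)$ and forms the two partial sums (upper limit $2m+1$ for the lower bound, $2m$ for the upper bound). The helper names and the direct-summation reference value are assumptions introduced here for illustration.

```python
import math

def poisson_entropy(lam):
    """Reference value: entropy of Poisson(lam) by truncated direct summation."""
    kmax = int(lam + 40 * math.sqrt(lam) + 60)
    h = 0.0
    for k in range(kmax + 1):
        log_p = -lam + k * math.log(lam) - math.lgamma(k + 1)
        h -= math.exp(log_p) * log_p
    return h

def c(k):
    """c(k) = sum_{j=0}^{k-1} (-1)^(k-1-j) * C(k-1, j) * log(j+1)."""
    return sum((-1) ** (k - 1 - j) * math.comb(k - 1, j) * math.log(j + 1)
               for j in range(k))

def theorem1_bounds(lam, m):
    """Return (lower, upper) bounds on H(lam) from Theorem 1."""
    def partial(upper_index):
        return sum(c(k) * lam ** k / math.factorial(k)
                   for k in range(2, upper_index + 1))
    shift = lam - lam * math.log(lam)  # move lam*log(lam) - lam to the right side
    return partial(2 * m + 1) + shift, partial(2 * m) + shift

lam = 0.5
for m in (1, 2, 3):
    lo, hi = theorem1_bounds(lam, m)
    print(m, lo, poisson_entropy(lam), hi)
```

For small $\lambda$ the gap shrinks quickly with $m$, in line with the $O(\lambda^{2m+1})$ remark above.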



In what follows, the $k$th central moment of the Poisson
distribution

$$\mu_k(s) = E[(N_s - s)^k]$$

plays an important role. The first few values of $\mu_k(s)$ are
$\mu_0(s) = 1$, $\mu_1(s) = 0$, and

$$\mu_2(s) = \mu_3(s) = s, \quad \mu_4(s) = 3s^2 + s, \quad \mu_5(s) = 10s^2 + s.$$

They obey the well-known recursion ([19], p. 162)

$$\mu_k(s) = s \sum_{j=0}^{k-2} \binom{k-1}{j} \mu_j(s), \quad k \ge 2, \tag{3}$$

from which it is easy to show that, for $k \ge 2$, $\mu_k(s)$ is
a polynomial in $s$ of degree $\lfloor k/2 \rfloor$, where $\lfloor x \rfloor$ denotes the
integer part of $x$.
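
The recursion (3) can be run symbolically with plain integer coefficient lists. The following sketch, with a hypothetical function name chosen here, returns each $\mu_k(s)$ as a list of coefficients ordered from the constant term upward.

```python
from math import comb

def poisson_central_moments(kmax):
    """Coefficient lists (lowest degree first) of mu_0, ..., mu_kmax in s,
    generated by mu_k(s) = s * sum_{j=0}^{k-2} C(k-1, j) * mu_j(s)."""
    mu = [[1], [0]]  # mu_0 = 1, mu_1 = 0
    for k in range(2, kmax + 1):
        acc = [0] * (k // 2 + 1)  # degree of mu_k is floor(k/2)
        for j in range(k - 1):
            for d, coef in enumerate(mu[j]):
                acc[d + 1] += comb(k - 1, j) * coef  # multiplying by s shifts the degree
        mu.append(acc)
    return mu[:kmax + 1]

mu = poisson_central_moments(6)
print(mu[4])  # coefficients of s + 3 s^2
print(mu[5])  # coefficients of s + 10 s^2
```

Working with exact integers avoids any rounding when these polynomials are later integrated term by term.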

In contrast to Theorem 1, Theorem 2 is most effective for
large $\lambda$.

Theorem 2: For any $\lambda > 0$ and $m = 1, 2, \ldots$, we have

$$-r_m(\lambda) \le H(\lambda) - \frac{1}{2}\log(2\pi\lambda) - \frac{1}{2} - \beta_m(\lambda) \le 0,$$

where

$$\beta_m(\lambda) = \int_\lambda^\infty \sum_{k=3}^{2m+1} \frac{(-1)^{k-1}\mu_k(s)}{k(k-1)s^k}\,ds = \sum_{k=1}^{2m-1} \frac{b(m,k)}{\lambda^k} \tag{4}$$

and

$$r_m(\lambda) = \int_\lambda^\infty \frac{\mu_{2m+2}(s)}{(2m+1)s^{2m+2}}\,ds = \sum_{k=m}^{2m} \frac{a(m,k)}{\lambda^k}. \tag{5}$$



Let us note that the seemingly cumbersome expressions (4)
and (5) are actually quite easy to handle. For $j \ge 3$, the
$j$th central moment $\mu_j(s)$ is a polynomial in $s$ of degree
$\lfloor j/2 \rfloor$. This, together with (3), shows that the integrand in
(4) is a polynomial in $s^{-1}$, with powers going from $s^{-2}$
to $s^{-2m}$. Similar statements hold for the integrand in (5).
Hence the constants $a(m,k)$ and $b(m,k)$ are obtained after
straightforward integration; see Table I for their values for
small $m$. In particular, for $m = 2$ we have

$$\beta_2(\lambda) = -\frac{1}{12\lambda} + \frac{5}{24\lambda^2} + \frac{1}{60\lambda^3}, \qquad r_2(\lambda) = \frac{3}{2\lambda^2} + \frac{5}{3\lambda^3} + \frac{1}{20\lambda^4}.$$

We emphasize that the constants $b(m,k)$, $1 \le k \le m-1$,
are exact in the full asymptotic expansion of $H(\lambda)$, since (5)
gives $r_m(\lambda) = O(\lambda^{-m})$. For example, we have $b(3,1) =
-1/12$ and $b(3,2) = -1/24$ from Table I, and hence

$$H(\lambda) = \frac{1}{2}\log(2\pi\lambda) + \frac{1}{2} - \frac{1}{12\lambda} - \frac{1}{24\lambda^2} + O(\lambda^{-3}),$$

which agrees with the leading terms given by, e.g., Knessl
([20], Theorem 2).
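
The "straightforward integration" can be mechanised. The sketch below assumes the integral forms of $\beta_m(\lambda)$ and $r_m(\lambda)$ as read here, namely integrands $(-1)^{k-1}\mu_k(s)/(k(k-1)s^k)$ for $k = 3, \ldots, 2m+1$ and $\mu_{2m+2}(s)/((2m+1)s^{2m+2})$, and uses $\int_\lambda^\infty s^{-(e+1)}\,ds = \lambda^{-e}/e$. Function names are hypothetical; exact rational arithmetic is used so the output can be compared with Table I.

```python
from fractions import Fraction
from math import comb

def poisson_central_moments(kmax):
    """Coefficient lists (lowest degree first) of mu_0..mu_kmax via recursion (3)."""
    mu = [[1], [0]]
    for k in range(2, kmax + 1):
        acc = [0] * (k // 2 + 1)
        for j in range(k - 1):
            for d, coef in enumerate(mu[j]):
                acc[d + 1] += comb(k - 1, j) * coef
        mu.append(acc)
    return mu

def theorem2_constants(m):
    """Return dicts a, b mapping the power k of 1/lambda to a(m,k), b(m,k)."""
    mu = poisson_central_moments(2 * m + 2)
    a, b = {}, {}
    for k in range(3, 2 * m + 2):
        sign = (-1) ** (k - 1)
        for d, coef in enumerate(mu[k]):
            if coef:
                e = k - d - 1  # s^(d-k) integrates to lambda^(-e) / e
                b[e] = b.get(e, 0) + Fraction(sign * coef, k * (k - 1) * e)
    k = 2 * m + 2
    for d, coef in enumerate(mu[k]):
        if coef:
            e = k - d - 1
            a[e] = a.get(e, 0) + Fraction(coef, (2 * m + 1) * e)
    return a, b

a2, b2 = theorem2_constants(2)
print(a2, b2)
```

Calling `theorem2_constants(m)` for $m = 1, \ldots, 4$ gives a way to cross-check the entries of Table I.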

Bounds on $H(\lambda)$ given by Theorem 2 are illustrated in
Fig. 1. As $\lambda$ increases from 10 to 20, the gap between upper
and lower bounds, $r_m(\lambda)$, decreases
• from ≈ 0.1 to ≈ 0.05 with $m = 1$,
• from ≈ 0.017 to ≈ 0.004 with $m = 2$, and
TABLE I

Values of $a(m,k)$ and $b(m,k)$ for $m = 1, 2, 3, 4$.

a(m,k)
 k     m = 1     m = 2     m = 3      m = 4
 1     1
 2     1/6       3/2
 3               5/3       5
 4               1/20      35/2       105/4
 5                         17/5       210
 6                         1/42       2275/18
 7                                    167/21
 8                                    1/72

b(m,k)
 k     m = 1     m = 2     m = 3      m = 4
 1     1/6       -1/12     -1/12      -1/12
 2               5/24      -1/24      -1/24
 3               1/60      103/180    -19/360
 4                         13/40      201/80
 5                         1/210      12367/2520
 6                                    571/1008
 7                                    1/504


[Figure: left panel "bounds on $H(\lambda)$"; right panel "gaps between bounds".]

Fig. 1. Bounds on $H(\lambda)$ (left) and differences between upper and lower
bounds (right) given by Theorem 2 with $m = 1, 2, 3$.



• from ≈ 0.0068 to ≈ 0.00074 with $m = 3$.
In short, for moderate $\lambda$, the bounds are already quite accurate
with $m$ as small as 3, and, as expected, the accuracy improves
as $\lambda$ increases.
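
The gaps quoted above can be reproduced from the Table I constants. The sketch below hard-codes the entries for $m = 1, 2, 3$ as read from the table, evaluates the lower and upper bounds of Theorem 2, and compares them with a direct-summation reference value. All names are illustrative assumptions.

```python
import math

# Table I constants for m = 1, 2, 3, keyed by the power k of 1/lambda.
B = {
    1: {1: 1 / 6},
    2: {1: -1 / 12, 2: 5 / 24, 3: 1 / 60},
    3: {1: -1 / 12, 2: -1 / 24, 3: 103 / 180, 4: 13 / 40, 5: 1 / 210},
}
A = {
    1: {1: 1.0, 2: 1 / 6},
    2: {2: 3 / 2, 3: 5 / 3, 4: 1 / 20},
    3: {3: 5.0, 4: 35 / 2, 5: 17 / 5, 6: 1 / 42},
}

def poisson_entropy(lam):
    """Reference value: entropy of Poisson(lam) by truncated direct summation."""
    kmax = int(lam + 40 * math.sqrt(lam) + 60)
    h = 0.0
    for k in range(kmax + 1):
        log_p = -lam + k * math.log(lam) - math.lgamma(k + 1)
        h -= math.exp(log_p) * log_p
    return h

def theorem2_bounds(lam, m):
    """Return (lower, upper) bounds on H(lam) from Theorem 2."""
    beta = sum(coef / lam ** k for k, coef in B[m].items())
    gap = sum(coef / lam ** k for k, coef in A[m].items())
    upper = 0.5 * math.log(2 * math.pi * lam) + 0.5 + beta
    return upper - gap, upper

for lam in (10.0, 20.0):
    for m in (1, 2, 3):
        lo, hi = theorem2_bounds(lam, m)
        print(lam, m, lo, poisson_entropy(lam), hi, hi - lo)
```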

B. Exact Formulae and Sharp Bounds for $D(n,p)$

While bounds on Poisson convergence are often stated in
terms of the total variation distance ([4], [2]), those based on
the relative entropy can also be quite effective ([?], [15]).
As a discrepancy measure, relative entropy occurs naturally
in contexts such as hypothesis testing. In this section we
consider $D(n,p) = D(B_{n,p}\|N_{np})$, and study its higher-order
asymptotic behavior. Our main result, Theorem 4, may be
regarded as a refinement of that of Harremoës and Ruzankin
[15]. As a by-product, we also obtain bounds on the entropy
of the binomial that parallel Theorem 2.

Analogous to Theorem 1, we have the following exact
expansion for $D(n,p)$ as a function of $p$.

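Theorem 3: Fix $n \in \mathbb{N}$. For $p \in [0, 1]$ we have

$$D(n,p) = n(p + q\log q) + \sum_{k=2}^{n} \binom{n}{k}\tilde{c}(k)\,p^k,$$

where

$$\tilde{c}(k) = \sum_{j=0}^{k-1} (-1)^{k-1-j}\binom{k-1}{j}\log(n-j), \quad k = 2, \ldots, n.$$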
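
Since Theorem 3 is an exact finite expansion, it can be checked against the definition of relative entropy. The sketch below assumes the expansion in the form $D(n,p) = n(p + q\log q) + \sum_{k=2}^{n}\binom{n}{k}\tilde{c}(k)p^k$ with $\tilde{c}(k)$ the alternating binomial sum of $\log(n-j)$; the function names are chosen here for illustration.

```python
import math

def relative_entropy_direct(n, p):
    """D(Binomial(n,p) || Poisson(np)) summed directly from the definition."""
    q = 1.0 - p
    lam = n * p
    d = 0.0
    for k in range(n + 1):
        log_b = math.log(math.comb(n, k)) + k * math.log(p) + (n - k) * math.log(q)
        log_pois = -lam + k * math.log(lam) - math.lgamma(k + 1)
        d += math.exp(log_b) * (log_b - log_pois)
    return d

def c_tilde(n, k):
    """sum_{j=0}^{k-1} (-1)^(k-1-j) * C(k-1, j) * log(n - j)."""
    return sum((-1) ** (k - 1 - j) * math.comb(k - 1, j) * math.log(n - j)
               for j in range(k))

def relative_entropy_theorem3(n, p):
    """D(n,p) from the finite expansion in powers of p."""
    q = 1.0 - p
    series = sum(math.comb(n, k) * c_tilde(n, k) * p ** k for k in range(2, n + 1))
    return n * (p + q * math.log(q)) + series

print(relative_entropy_direct(10, 0.3), relative_entropy_theorem3(10, 0.3))
```

The alternating sums in `c_tilde` lose precision for large $n$, so this check is meant for small $n$ only.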

In the following result, the $k$th central moment of $B_{n,p}$,

$$\mu_k(n,p) = E[(B_{n,p} - np)^k],$$

plays an important role. The first few values of $\mu_k(n,p)$ are

$$\mu_0(n,p) = 1, \quad \mu_1(n,p) = 0, \quad \mu_2(n,p) = pqn,$$
$$\mu_3(n,p) = (q-p)pqn, \quad \mu_4(n,p) = 3(pqn)^2 + (1 - 6pq)pqn.$$

We write $q = 1 - p$ throughout.

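Analogous to Theorem 2, Theorem 4 gives bounds on
$D(n,p)$ that are effective for large $n$.

Theorem 4: For $n, m \in \mathbb{N}$ and $p \in (0, 1)$, we have

$$0 \le D(n,p) + \frac{p + \log q}{2} - \beta_m(n,p) \le r_m(n,p),$$

where

$$\beta_m(n,p) = n\int_q^1 \sum_{k=3}^{2m+1} \frac{(-1)^k \mu_k(n,s)}{k(k-1)(ns)^k}\,ds = \sum_{k=1}^{2m-1} \frac{b(m,k;p)}{n^k}$$

and

$$r_m(n,p) = n\int_q^1 \frac{\mu_{2m+2}(n,s)}{(2m+1)(ns)^{2m+2}}\,ds = \sum_{k=m}^{2m} \frac{a(m,k;p)}{n^k}.$$

The integrals that define $\beta_m(n,p)$ and $r_m(n,p)$ are easy to
calculate, because $\mu_j(n,s)$ is a polynomial in $s$. We obtain
the coefficients $b(m,k;p)$ and $a(m,k;p)$ after integrating and
assembling the results in powers of $n^{-1}$. For example, we
have ($m = 1$)

$$b(1,1;p) = -\frac{1}{2}\log q + \frac{q}{3} - \frac{1}{6q} - \frac{1}{6},$$
$$a(1,1;p) = 2\log q - q + \frac{1}{q},$$
$$a(1,2;p) = -4\log q + 2q - \frac{7}{3q} + \frac{1}{6q^2} + \frac{1}{6},$$

and ($m = 2$)

$$b(2,1;p) = \frac{p^2}{12q},$$
$$b(2,2;p) = \frac{3}{2}\log q - \frac{q}{2} + \frac{17}{12q} - \frac{5}{24q^2} - \frac{17}{24},$$
$$b(2,3;p) = -3\log q + \frac{6q}{5} - \frac{5}{2q} + \frac{3}{8q^2} - \frac{1}{60q^3} + \frac{113}{120}, \tag{6}$$
$$a(2,2;p) = -9\log q + 3q - \frac{9}{q} + \frac{3}{2q^2} + \frac{9}{2},$$
$$a(2,3;p) = 78\log q - 26q + \frac{83}{q} - \frac{18}{q^2} + \frac{5}{3q^3} - \frac{122}{3},$$
$$a(2,4;p) = -72\log q + 24q - \frac{78}{q} + \frac{18}{q^2} - \frac{31}{15q^3} + \frac{1}{20q^4} + \frac{2281}{60}.$$
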

As in Theorem 2, we emphasize that, for each $m \ge 2$, the
constants $b(m,k;p)$, $1 \le k \le m-1$, are exact in the full
asymptotic expansion of $D(n,p)$ (for fixed $p$ as $n \to \infty$). In
particular, from (6) we get

$$D(n,p) = -\frac{p + \log q}{2} + \frac{p^2}{12qn} + O(n^{-2})$$

for fixed $p \in (0,1)$. We can also estimate the rate at which
$D(n,\lambda/n)$ decreases to zero, for fixed $\lambda > 0$ as $n \to \infty$,
which corresponds to the usual binomial-to-Poisson convergence.
Indeed, setting $m = 1$ and $p = \lambda/n$, Theorem 4 yields,
after routine calculations,

$$D(n,\lambda/n) = \frac{\lambda^2}{4n^2} + O(n^{-3}),$$

which can be further refined by using larger $m$.
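
Both statements can be probed numerically. The sketch below evaluates the $m = 1$ case of Theorem 4, with the coefficients $b(1,1;p)$, $a(1,1;p)$ and $a(1,2;p)$ entered by hand as reconstructed here (an assumption that should be checked against the source), and then compares $D(n,\lambda/n)$ with $\lambda^2/(4n^2)$. Function names are illustrative.

```python
import math

def relative_entropy(n, p):
    """D(Binomial(n,p) || Poisson(np)) summed directly from the definition."""
    q = 1.0 - p
    lam = n * p
    d = 0.0
    for k in range(n + 1):
        log_b = math.log(math.comb(n, k)) + k * math.log(p) + (n - k) * math.log(q)
        log_pois = -lam + k * math.log(lam) - math.lgamma(k + 1)
        d += math.exp(log_b) * (log_b - log_pois)
    return d

def theorem4_m1_bounds(n, p):
    """(lower, upper) bounds on D(n,p) from Theorem 4 with m = 1."""
    q = 1.0 - p
    lq = math.log(q)
    b11 = -lq / 2 + q / 3 - 1 / (6 * q) - 1 / 6       # assumed form of b(1,1;p)
    a11 = 2 * lq - q + 1 / q                          # assumed form of a(1,1;p)
    a12 = -4 * lq + 2 * q - 7 / (3 * q) + 1 / (6 * q * q) + 1 / 6  # a(1,2;p)
    lower = -(p + lq) / 2 + b11 / n
    return lower, lower + a11 / n + a12 / n ** 2

n, p = 50, 0.2
print(theorem4_m1_bounds(n, p), relative_entropy(n, p))

lam, n = 2.0, 400
print(relative_entropy(n, lam / n) / (lam ** 2 / (4 * n ** 2)))  # ratio close to 1
```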

Such results are related to those of Harremoës and Ruzankin
[15], who give several bounds on $D(n,p)$ after detailed
analyses on inequalities involving Stirling numbers. Theorem 4
may be regarded as a refinement in that, by letting $m$ be
arbitrary, it can give a full asymptotic expansion of $D(n,p)$,
with computable bounds. Our derivation is also simpler (see
Section IV).

Let us denote the entropy of the binomial by $H(n,p) =
H(B_{n,p})$. In the special case $p = 1/2$, $H(n,p)$ appears as the
sum capacity of a noiseless $n$-user binary adder channel as
analyzed by Chang and Weldon [7], who also provide simple
bounds on $H(n,p)$. For general $p$, asymptotic expansions for
$H(n,p)$ have been obtained by Jacquet and Szpankowski [17]
and Knessl [20] (see also Flajolet [11]). We present sharp
bounds on $H(n,p)$ that complement such expansions. As it
turns out, due to an elementary identity (see (29) below),
bounds on $D(n,p)$ obtained in Theorem 4 translate directly
to those on $H(n,p)$, thus simplifying our analysis.

Corollary 1: Let $n, m \in \mathbb{N}$ and $p \in (0, 1)$. Define

$$\bar{H}(n,p) = H(n,p) - \log n! + n\log n - n - \frac{1 + \log(pq)}{2}.$$

Then

$$0 \le -\bar{H}(n,p) - \sum_{k=1}^{2m-1} \frac{b(m,k;p) + b(m,k;q)}{n^k} \le \sum_{k=m}^{2m} \frac{a(m,k;p) + a(m,k;q)}{n^k},$$

where $a(m,k;p)$ and $b(m,k;p)$ are defined as in Theorem 4.
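Bounds on $H(n,p)$ can also be expressed in terms of $\log n$
and $n^{-k}$, $k = 1, 2, \ldots$, via familiar bounds on $\log n!$. For
example, taking $m = 1$ in Corollary 1 and using (see [1],
6.1.42)

$$\frac{1}{12n} - \frac{1}{360n^3} \le \log n! - n\log n + n - \frac{1}{2}\log(2\pi n) \le \frac{1}{12n},$$

we get

$$\frac{C_1}{n} + \frac{C_2}{n^2} + \frac{C_3}{n^3} \le H(n,p) - \frac{1}{2}\log(2\pi npq) - \frac{1}{2} \le \frac{C_4}{n}, \tag{7}$$

where

$$C_1 = \frac{13}{12} - \frac{3\log(pq)}{2} - \frac{5}{6pq},$$
$$C_2 = -\frac{7}{3} + 4\log(pq) + \frac{8}{3pq} - \frac{1}{6p^2q^2},$$
$$C_3 = -\frac{1}{360},$$
$$C_4 = \frac{1}{12} + \frac{\log(pq)}{2} + \frac{1}{6pq}.$$
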

To relate (7) to the Poisson case, i.e., Theorem 2, let us fix
$\lambda > 0$ and set $p = \lambda/n$. Then, as $n \to \infty$, we have $H(n,p) \to
H(\lambda)$, and in the limit (7) becomes

$$-\frac{1}{\lambda} - \frac{1}{6\lambda^2} \le H(\lambda) - \frac{1}{2}\log(2\pi\lambda) - \frac{1}{2} - \frac{1}{6\lambda} \le 0,$$

which is precisely Theorem 2 with $m = 1$. In general ($m \ge 1$),
Theorem 2 may be seen as a limiting case of Corollary 1.

As before, for each $m \ge 2$, the coefficients $b(m,k;p) +
b(m,k;q)$, $k = 1, \ldots, m-1$, are exact in the asymptotic
expansion of $H(n,p)$ (for fixed $p$ as $n \to \infty$). For large $n$,
we may choose larger $m$ to improve the accuracy.
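
A small numerical check of the double inequality (7) is sketched below. The four constants are entered in the form $C_1 = 13/12 - \tfrac{3}{2}\log(pq) - 5/(6pq)$, $C_2 = -7/3 + 4\log(pq) + 8/(3pq) - 1/(6p^2q^2)$, $C_3 = -1/360$ and $C_4 = 1/12 + \tfrac{1}{2}\log(pq) + 1/(6pq)$; the label $C_4$ for the upper constant and these closed forms are a reconstruction and should be treated as assumptions. The binomial entropy is computed by direct summation.

```python
import math

def binomial_entropy(n, p):
    """Entropy (in nats) of Binomial(n, p) by direct summation."""
    q = 1.0 - p
    h = 0.0
    for k in range(n + 1):
        log_b = math.log(math.comb(n, k)) + k * math.log(p) + (n - k) * math.log(q)
        h -= math.exp(log_b) * log_b
    return h

def bounds_7(n, p):
    """(lower, upper) bounds on H(n,p) - 0.5*log(2*pi*n*p*q) - 0.5 as in (7)."""
    pq = p * (1.0 - p)
    lpq = math.log(pq)
    c1 = 13 / 12 - 1.5 * lpq - 5 / (6 * pq)
    c2 = -7 / 3 + 4 * lpq + 8 / (3 * pq) - 1 / (6 * pq * pq)
    c3 = -1 / 360
    c4 = 1 / 12 + lpq / 2 + 1 / (6 * pq)
    return c1 / n + c2 / n ** 2 + c3 / n ** 3, c4 / n

n, p = 30, 0.3
centered = binomial_entropy(n, p) - 0.5 * math.log(2 * math.pi * n * p * (1 - p)) - 0.5
print(bounds_7(n, p), centered)
```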

III. Proofs of Theorems 1 and 2

A. Preliminaries 

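Given a function $\phi : \mathbb{R} \to \mathbb{R}$, the $m$th forward difference
of $\phi$ is defined recursively by

$$\Delta^0\phi(x) = \phi(x), \quad \Delta^{m+1}\phi(x) = \Delta^m\phi(x+1) - \Delta^m\phi(x),$$

for any $x \in \mathbb{R}$ and $m \in \mathbb{Z}_+$. Equivalently, we have

$$\Delta^m\phi(x) = \sum_{j=0}^{m} \binom{m}{j}(-1)^{m-j}\phi(x+j), \quad x \in \mathbb{R}, \ m \in \mathbb{Z}_+. \tag{8}$$

(This definition extends to functions defined only on $\mathbb{Z}_+$.) As
usual, $\phi^{(m)}$ stands for the $m$th derivative of $\phi$. If $\phi$ is infinitely
differentiable, then

$$\Delta^m\phi(x) = E[\phi^{(m)}(x + S_m)], \tag{9}$$

where $S_m = U_1 + \cdots + U_m$ and $(U_k)_{k \ge 1}$ is a sequence of
independent and identically distributed (i.i.d.) random variables
having the uniform distribution on $[0, 1]$ (cf. [3], Eqn. (2.7)).

On the other hand, two special properties of the Poisson
distribution are

$$\lambda E[\phi(N_\lambda + 1)] = E[N_\lambda\phi(N_\lambda)], \quad \lambda > 0, \tag{10}$$

and (see [3], Theorem 2.1, for example)

$$\frac{d^m E[\phi(N_\lambda)]}{d\lambda^m} = E[\Delta^m\phi(N_\lambda)], \quad \lambda > 0, \ m \in \mathbb{Z}_+. \tag{11}$$

In both (10) and (11), $\phi : \mathbb{Z}_+ \to \mathbb{R}$ is an arbitrary function
for which the relevant expectations exist. From (11) we have
the Taylor formula

$$E[\phi(N_\lambda)] = \sum_{k=0}^{m} \frac{\Delta^k\phi(0)}{k!}\lambda^k + \frac{1}{m!}\int_0^\lambda (\lambda - u)^m E[\Delta^{m+1}\phi(N_u)]\,du, \tag{12}$$

for any $\lambda \ge 0$ and $m \in \mathbb{Z}_+$.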
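
The two Poisson identities used throughout, $\lambda E[\phi(N_\lambda + 1)] = E[N_\lambda\phi(N_\lambda)]$ and $\frac{d}{d\lambda}E[\phi(N_\lambda)] = E[\Delta\phi(N_\lambda)]$, are easy to confirm numerically. The sketch below does so for the test function $\phi(k) = \log(k+1)$; the truncation point, step size, and function names are assumptions made for illustration.

```python
import math

def poisson_expect(phi, lam, kmax=400):
    """E[phi(N_lam)] by truncated summation (kmax is an assumed cutoff)."""
    total = 0.0
    for k in range(kmax + 1):
        log_p = -lam + k * math.log(lam) - math.lgamma(k + 1)
        total += math.exp(log_p) * phi(k)
    return total

def phi(k):
    return math.log(k + 1)

lam = 3.7

# Identity (10): lam * E[phi(N + 1)] versus E[N * phi(N)].
lhs_10 = lam * poisson_expect(lambda k: phi(k + 1), lam)
rhs_10 = poisson_expect(lambda k: k * phi(k), lam)

# Identity (11) with m = 1: derivative in lam versus E[phi(N + 1) - phi(N)].
h = 1e-5
lhs_11 = (poisson_expect(phi, lam + h) - poisson_expect(phi, lam - h)) / (2 * h)
rhs_11 = poisson_expect(lambda k: phi(k + 1) - phi(k), lam)

print(lhs_10, rhs_10)
print(lhs_11, rhs_11)
```

The derivative in the second check is a central finite difference, so agreement is only expected up to the discretisation error.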
B. Proof of Theorem 1

Let $\lambda > 0$. We have

$$H(\lambda) = \lambda - \lambda\log\lambda + E[\log N_\lambda!] \tag{13}$$

by definition. Applying (12) to the function $\phi(k) = \log k!$, $k \in
\mathbb{Z}_+$, we obtain

$$H(\lambda) + \lambda\log\lambda - \lambda = \sum_{k=2}^{m} \frac{\Delta^k\phi(0)}{k!}\lambda^k + \frac{1}{m!}\int_0^\lambda (\lambda - u)^m E[\Delta^{m+1}\phi(N_u)]\,du \tag{14}$$

for any $m = 2, 3, \ldots$, since $\Delta^0\phi(0) = \Delta^1\phi(0) = 0$.

On the other hand, letting $g(x) = \log(x + 1)$, $x \ge 0$, and
using (8) and (9), we can write

$$\Delta^{m+1}\phi(i) = \Delta^m g(i) = E[g^{(m)}(i + S_m)] = (-1)^{m-1}(m-1)!\,E\left[\frac{1}{(i + 1 + S_m)^m}\right], \tag{15}$$

where $i \in \mathbb{Z}_+$ and $m = 1, 2, \ldots$. Therefore,

$$\Delta^k\phi(0) = \Delta^{k-1}g(0) = \sum_{j=0}^{k-1} \binom{k-1}{j}(-1)^{k-1-j}\log(j+1) = c(k), \tag{16}$$

for $k = 2, 3, \ldots$. The conclusion follows from (14) and (16)
by noting that, based on (15), the integral in (14) alternates in
sign for $m = 1, 2, \ldots$.

C. Proof of Theorem 2

Some auxiliary results are needed in the proof of Theorem 2.
In Lemmas 1 and 2 we denote

$$\tilde{H}(\lambda) = H(\lambda) - \frac{1}{2}\log(2\pi\lambda) - \frac{1}{2}, \quad \lambda > 0, \tag{17}$$

for convenience.

Lemma 1: We have $\tilde{H}(\lambda) = O(\lambda^{-1})$, as $\lambda \to \infty$.

Proof: See [10] or [20]. ■

Lemma 2 expresses the quantity of interest in terms of an
easier quantity, $E[\log(N_s + 1)]$.

Lemma 2: For $\lambda > 0$ we have

$$\tilde{H}(\lambda) = \int_\lambda^\infty \left(\frac{1}{2s} - E\left[\log\frac{N_s + 1}{s}\right]\right) ds.$$

Proof: Recalling (13) and using (11) with $\phi(k) =
\log k!$, $k \in \mathbb{Z}_+$, and $m = 1$, we obtain from (17)

$$\tilde{H}'(\lambda) = E[\log(N_\lambda + 1)] - \log\lambda - \frac{1}{2\lambda}.$$

By Lemma 1, $\tilde{H}(\infty) = 0$, and the claim follows. ■

The next result presents sharp bounds on $E[\log(N_s + 1)]$.

Proposition 1: For $s > 0$ and $m \in \mathbb{N}$, we have

$$0 \le E\left[\log\frac{N_s + 1}{s}\right] - \sum_{k=2}^{2m+1} \frac{(-1)^k\mu_k(s)}{k(k-1)s^k} \le \frac{\mu_{2m+2}(s)}{(2m+1)s^{2m+2}}.$$

Proof: By (10) we have

$$E\left[\log\frac{N_s + 1}{s}\right] = \frac{1}{s}\left(E[N_s\log N_s] - s\log s\right). \tag{18}$$

Taking into account that $E[N_s] = s$, the lower bound follows
from (18) and the inequality (upon letting $x = N_s$)

$$x\log x - s\log s \ge (\log s + 1)(x - s) + \sum_{k=2}^{2m+1} \frac{(-1)^k(x - s)^k}{k(k-1)s^{k-1}}, \tag{19}$$

which holds for $x \ge 0$, $s > 0$.

On the other hand, using the inequality

$$\log(1 + x) \le \sum_{k=1}^{2m+1} \frac{(-1)^{k-1}x^k}{k}, \quad x > -1, \tag{20}$$

with $x = (N_s - s + 1)/s$, we obtain

$$E\left[\log\frac{N_s + 1}{s}\right] \le \sum_{k=1}^{2m+1} \frac{(-1)^{k-1}E[(N_s - s + 1)^k]}{k s^k}. \tag{21}$$

Again by (10), we have

$$sE[(N_s - s + 1)^k] = E[N_s(N_s - s)^k] = \mu_{k+1}(s) + s\mu_k(s).$$

Hence the right-hand side of (21) is equal to

$$\frac{\mu_{2m+2}(s)}{(2m+1)s^{2m+2}} + \sum_{k=2}^{2m+1} \frac{(-1)^k\mu_k(s)}{k(k-1)s^k},$$

which proves the upper bound. ■

Remark. The quantity $E[\log(N_s + 1)]$ already appears as
Example 2 of Entry 10 in Chapter 3 of Ramanujan's second
notebook ([5]). Proposition 1 gives a finite expansion of
$E[\log(N_s + 1)]$ with an explicit upper bound of the order
of $s^{-(m+1)}$. See [31] for related work on this particular entry
of Ramanujan.

Proof of Theorem 2: The claim follows directly from
Lemma 2 and Proposition 1 upon noting $\mu_2(s) = s$. ■
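
Proposition 1 can be checked numerically as well. The sketch below assumes the statement in the form $0 \le E[\log((N_s+1)/s)] - \sum_{k=2}^{2m+1} (-1)^k\mu_k(s)/(k(k-1)s^k) \le \mu_{2m+2}(s)/((2m+1)s^{2m+2})$, evaluates the central moments by running the recursion (3) in floating point, and computes the expectation by truncated summation. Names and the truncation point are illustrative assumptions.

```python
import math

def central_moments(s, kmax):
    """Numerical values mu_0(s), ..., mu_kmax(s) from recursion (3)."""
    mu = [1.0, 0.0]
    for k in range(2, kmax + 1):
        mu.append(s * sum(math.comb(k - 1, j) * mu[j] for j in range(k - 1)))
    return mu

def expected_log_ratio(s, kmax=400):
    """E[log((N_s + 1) / s)] by truncated summation (kmax is an assumed cutoff)."""
    total = 0.0
    for k in range(kmax + 1):
        log_p = -s + k * math.log(s) - math.lgamma(k + 1)
        total += math.exp(log_p) * math.log((k + 1) / s)
    return total

def proposition1_terms(s, m):
    """Return (difference, upper bound) appearing in Proposition 1."""
    mu = central_moments(s, 2 * m + 2)
    partial = sum((-1) ** k * mu[k] / (k * (k - 1) * s ** k)
                  for k in range(2, 2 * m + 2))
    diff = expected_log_ratio(s) - partial
    bound = mu[2 * m + 2] / ((2 * m + 1) * s ** (2 * m + 2))
    return diff, bound

for m in (1, 2, 3):
    print(m, proposition1_terms(5.0, m))
```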

IV. Proofs of Theorems 3 and 4 and Corollary 1

A. Preliminaries 

Let us recall two special properties of the binomial variable
$B_{n,p}$. For $n \in \mathbb{N}$, $p \in [0, 1]$, and any function $\phi :
\{0, 1, \ldots, n\} \to \mathbb{R}$, we have

$$npE[\phi(B_{n-1,p} + 1)] = E[B_{n,p}\phi(B_{n,p})] \tag{22}$$

as well as (see [2], Theorem 1, for example)

$$\frac{d^k E[\phi(B_{n,p})]}{dp^k} = \begin{cases} k!\binom{n}{k} E[\Delta^k\phi(B_{n-k,p})], & k = 0, 1, \ldots, n; \\ 0, & k = n+1, n+2, \ldots \end{cases} \tag{23}$$

From (23) we obtain the Taylor formula

$$E[\phi(B_{n,p})] = \sum_{k=0}^{n} \binom{n}{k} E[\Delta^k\phi(B_{n-k,t})](p - t)^k, \tag{24}$$

where $p, t \in [0, 1]$.
B. Proof of Theorem 3

By direct calculation

$$D(n,p) = n(p + q\log q) - np\log n + E\left[\log\frac{n!}{(n - B_{n,p})!}\right]. \tag{25}$$

Using (24) with $\phi(l) = \log(n!/(n - l)!)$, $l = 0, \ldots, n$ and
$t = 0$, we see that the expectation in (25) is equal to

$$E[\phi(B_{n,p})] = \sum_{k=0}^{n} \binom{n}{k}\Delta^k\phi(0)\,p^k, \quad p \in [0, 1]. \tag{26}$$

Denote $g(l) = \log(n - l)$, $l = 0, 1, \ldots, n-1$. Observe that

$$\Delta^k\phi = \Delta^{k-1}g, \quad k \in \mathbb{N}, \qquad \Delta^0\phi(0) = 0, \quad \Delta^1\phi(0) = \log n.$$

Therefore, we have from (8)

$$\Delta^k\phi(0) = \Delta^{k-1}g(0) = \sum_{j=0}^{k-1} \binom{k-1}{j}(-1)^{k-1-j}\log(n - j) \tag{27}$$

for $k = 2, \ldots, n$. The claim follows from (25), (26), and (27). ■


C. Proof of Theorem 4

The following integral representation is crucial in the proof
of Theorem 4.

Lemma 3: For any $p \in [0, 1]$ we have

$$D(n,p) = n\int_q^1 E\left[\log\frac{B_{n-1,s} + 1}{ns}\right] ds.$$

Proof: Differentiating (25) once and applying (23) with
$k = 1$ and $\phi(l) = \log(n!/(n - l)!)$, we get

$$\frac{dD(n,p)}{dp} = nE[\log(n - B_{n-1,p})] - n\log(nq).$$

Since $D(n,0) = 0$, we have

$$D(n,p) = n\int_0^p \left(E[\log(n - B_{n-1,t})] - \log(n(1 - t))\right) dt.$$

Note that $n - B_{n-1,t}$ and $B_{n-1,1-t} + 1$ have the same
distribution. The claim follows by a change of variables
$s = 1 - t$. ■

Proposition 2 gives upper and lower bounds on the key
quantity $E[\log((B_{n-1,s} + 1)/(ns))]$. This parallels
Proposition 1.

Proposition 2: For any $m \in \mathbb{N}$ and $s \in (0,1)$ we have

$$0 \le E\left[\log\frac{B_{n-1,s} + 1}{ns}\right] - \sum_{k=2}^{2m+1} \frac{(-1)^k\mu_k(n,s)}{k(k-1)(ns)^k} \le \frac{\mu_{2m+2}(n,s)}{(2m+1)(ns)^{2m+2}}.$$

Proof: By (22) we have

$$nsE[\log(B_{n-1,s} + 1)] = E[B_{n,s}\log(B_{n,s})].$$

The lower bound follows from this and (19).

As in the proof of Proposition 1, using (20) with

$$x = \frac{B_{n-1,s} + 1 - ns}{ns},$$

we obtain

$$E\left[\log\frac{B_{n-1,s} + 1}{ns}\right] \le \sum_{k=1}^{2m+1} \frac{(-1)^{k-1}E[(B_{n-1,s} - ns + 1)^k]}{k(ns)^k}. \tag{28}$$

Again by (22), we have

$$nsE[(B_{n-1,s} - ns + 1)^k] = E[B_{n,s}(B_{n,s} - ns)^k] = \mu_{k+1}(n,s) + ns\,\mu_k(n,s).$$

This, in conjunction with (28), proves the upper bound. ■

Proof of Theorem 4: Note that

$$n\int_q^1 \frac{\mu_2(n,s)}{2(ns)^2}\,ds = -\frac{p + \log q}{2}.$$

The claim then follows from Lemma 3 and Proposition 2. ■

D. Proof of Corollary 1

From the definitions we have

$$H(n,p) = \log n! - n\log n + n - D(n,p) - D(n,q), \tag{29}$$

where $q = 1 - p$ as before. Hence, Corollary 1 is an immediate
consequence of (29) and Theorem 4. ■
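
The identity $H(n,p) = \log n! - n\log n + n - D(n,p) - D(n,q)$ can be confirmed numerically in a few lines. The sketch below computes both sides by direct summation; the function names are chosen here for illustration.

```python
import math

def binomial_log_pmf(n, p, k):
    return math.log(math.comb(n, k)) + k * math.log(p) + (n - k) * math.log(1.0 - p)

def binomial_entropy(n, p):
    """H(n,p): entropy of Binomial(n,p) in nats."""
    return -sum(math.exp(binomial_log_pmf(n, p, k)) * binomial_log_pmf(n, p, k)
                for k in range(n + 1))

def relative_entropy(n, p):
    """D(n,p) = D(Binomial(n,p) || Poisson(np))."""
    lam = n * p
    d = 0.0
    for k in range(n + 1):
        log_b = binomial_log_pmf(n, p, k)
        log_pois = -lam + k * math.log(lam) - math.lgamma(k + 1)
        d += math.exp(log_b) * (log_b - log_pois)
    return d

def rhs_29(n, p):
    """Right-hand side of identity (29)."""
    q = 1.0 - p
    return (math.lgamma(n + 1) - n * math.log(n) + n
            - relative_entropy(n, p) - relative_entropy(n, q))

n, p = 12, 0.3
print(binomial_entropy(n, p), rhs_29(n, p))
```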

V. Discussion 

We have obtained asymptotically sharp and readily computable
bounds on the Poisson entropy function $H(\lambda)$. The
method also handles the entropy of the binomial, and the
relative entropy $D(n,p)$ between the binomial($n,p$) and
Poisson($np$) distributions, yielding full asymptotic expansions
with explicit constants. While some results are of theoretical
interest, bounds on the entropy are intended to aid channel
capacity calculations for discrete-time Poisson channels, for
example.

Besides entropy calculations, the method also extends to
quantities such as the fractional moments of these familiar
distributions. An example is $E[\sqrt{N_\lambda}]$, which appears as
Example 1, Entry 10, in Chapter 3 of Ramanujan's notebook
([5]). Analogous to Lemma 2, a double inequality can be
obtained for $E[\sqrt{N_\lambda}]$ (details omitted). This is of some
statistical interest because it gives accurate bounds on the
bias associated with the square root transformation, which is
variance-stabilizing for the Poisson.

For theoretical considerations, the Taylor formulae, integral
representations, and related results can also help to establish
interesting monotonicity properties of quantities such as $H(\lambda)$.
For example, it can be shown that $H'(\lambda)$ is a completely
monotonic function of $\lambda$, i.e., $(-1)^{k-1}H^{(k)}(\lambda) \ge 0$ for all $k \ge
1$. It appears that many entropy-like functions associated with
classical distributions are completely monotonic ([32]). One
conjecture ([33], Conjecture 1) states that $D(n,\lambda/n)$, $n > \lambda$,
is completely monotonic in $n$ for fixed $\lambda > 0$. The method of
this work may prove useful toward resolving such conjectures.